{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<center>\n",
    "    <p style=\"text-align:center\">\n",
    "        <img alt=\"phoenix logo\" src=\"https://storage.googleapis.com/arize-phoenix-assets/assets/phoenix-logo-light.svg\" width=\"200\"/>\n",
    "        <br>\n",
    "        <a href=\"https://arize.com/docs/phoenix/\">Docs</a>\n",
    "        |\n",
    "        <a href=\"https://github.com/Arize-ai/phoenix\">GitHub</a>\n",
    "        |\n",
    "        <a href=\"https://arize-ai.slack.com/join/shared_invite/zt-2w57bhem8-hq24MB6u7yE_ZF_ilOYSBw#/shared-invite/email\">Community</a>\n",
    "    </p>\n",
    "</center>\n",
    "<h1 align=\"center\">Using Sessions with LlamaIndex</h1>\n",
    "\n",
    "A Session is a sequence of traces representing a user's interaction with an application.\n",
    "\n",
    "In this tutorial, you will:\n",
    "- Build and trace a simple LlamaIndex application\n",
    "- Use sessions to organize traces\n",
    "\n",
    "ℹ️ This notebook requires an OpenAI API key."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Install Dependencies and Import Libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install -Uq arize-phoenix \"llama-index==0.14.0\" openinference-instrumentation-llama-index gcsfs faker"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "import os\n",
    "from getpass import getpass\n",
    "from random import sample\n",
    "from urllib.request import urlopen\n",
    "from uuid import uuid4\n",
    "\n",
    "from faker import Faker\n",
    "from gcsfs import GCSFileSystem\n",
    "from llama_index.core import (\n",
    "    Settings,\n",
    "    StorageContext,\n",
    "    load_index_from_storage,\n",
    ")\n",
    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
    "from llama_index.llms.openai import OpenAI\n",
    "from openinference.instrumentation import using_session, using_user\n",
    "from openinference.instrumentation.llama_index import LlamaIndexInstrumentor\n",
    "from tqdm import tqdm\n",
    "\n",
    "import phoenix as px\n",
    "from phoenix.otel import register"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Configure Your OpenAI API Key\n",
    "\n",
    "Set your OpenAI API key if it is not already set as an environment variable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if not (openai_api_key := os.getenv(\"OPENAI_API_KEY\")):\n",
    "    openai_api_key = getpass(\"🔑 Enter your OpenAI API key: \")\n",
    "os.environ[\"OPENAI_API_KEY\"] = openai_api_key"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Configure the default project then Launch Phoenix\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "🚨 Phoenix is configured with environment variables. 🚨\n",
    "\n",
    "In this tutorial we want to change the default project we send traces to by modifying the `PHOENIX_PROJECT_NAME` environment variable defined blow."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "os.environ[\"PHOENIX_PROJECT_NAME\"] = \"SESSIONS-DEMO\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Enable Phoenix tracing via `LlamaIndexInstrumentor`. Phoenix uses OpenInference traces - an open-source standard for capturing and storing LLM application traces that enables LLM applications to seamlessly integrate with LLM observability solutions such as Phoenix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "tracer_provider = register(endpoint=\"http://127.0.0.1:6006/v1/traces\")\n",
    "LlamaIndexInstrumentor().instrument(skip_dep_check=True, tracer_provider=tracer_provider)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Launch Phoenix"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "px.launch_app()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Build Your LlamaIndex Application"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This example uses a `RetrieverQueryEngine` over a pre-built index of the Arize documentation, but you can use whatever LlamaIndex application you like.\n",
    "\n",
    "Download our pre-built index of the Arize docs from cloud storage and instantiate your storage context."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "file_system = GCSFileSystem(project=\"public-assets-275721\")\n",
    "persist_dir = \"arize-phoenix-assets/datasets/unstructured/llm/llama-index/arize-docs/index/\"\n",
    "storage_context = StorageContext.from_defaults(fs=file_system, persist_dir=persist_dir)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We are now ready to instantiate our query engine that will perform retrieval-augmented generation (RAG). Query engine is a generic interface in LlamaIndex that allows you to ask question over your data. A query engine takes in a natural language query, and returns a rich response. It is built on top of Retrievers. You can compose multiple query engines to achieve more advanced capability"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Settings.llm = OpenAI(model=\"gpt-4o-mini\")\n",
    "Settings.embed_model = OpenAIEmbedding()\n",
    "index = load_index_from_storage(storage_context)\n",
    "query_engine = index.as_query_engine()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 5. Download Sample Queries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "queries_url = \"http://storage.googleapis.com/arize-phoenix-assets/datasets/unstructured/llm/context-retrieval/arize_docs_queries.jsonl\"\n",
    "with urlopen(queries_url) as response:\n",
    "    queries = [json.loads(line)[\"query\"] for line in response]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 6. Group Queries By User Sessions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "session_id = str(uuid4())\n",
    "session_user = Faker().user_name()\n",
    "\n",
    "with using_session(session_id), using_user(session_user):\n",
    "    for query in tqdm(sample(queries, 3)):\n",
    "        query_engine.query(query)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<video controls src=\"https://storage.googleapis.com/arize-phoenix-assets/assets/docs/notebooks/llama-index-knowledge-base-tutorial/project_sessions.mov\" />"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
